Estimating Uncertainty of Categorical Web Data

Authors

  • Davide Ceolin
  • Willem Robert van Hage
  • Wan Fokkink
  • Guus Schreiber
Abstract

Web data often manifest high levels of uncertainty. We focus on categorical Web data and represent these uncertainty levels as first- or second-order uncertainty. By means of concrete examples, we show how to quantify and handle these uncertainties using the Beta-Binomial and Dirichlet-Multinomial models, as well as how to take into account possibly unseen categories in our samples by using the Dirichlet Process.
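
The three models named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the uniform priors, and the concentration value are assumptions chosen for illustration, and the Dirichlet Process step uses the standard Chinese-restaurant-process predictive rule.

```python
def beta_posterior(successes, failures, a=1.0, b=1.0):
    """Mean and variance of the Beta posterior for a two-category sample.

    Assumes a Beta(a, b) prior; a = b = 1 is a uniform prior (an assumption).
    The variance expresses second-order uncertainty about the probability.
    """
    a_post = a + successes
    b_post = b + failures
    n = a_post + b_post
    mean = a_post / n
    variance = a_post * b_post / (n ** 2 * (n + 1))
    return mean, variance


def dirichlet_multinomial_predictive(counts, alpha=1.0):
    """Predictive probability of each observed category.

    Assumes a symmetric Dirichlet(alpha) prior over the observed categories.
    `counts` maps category name to observed count.
    """
    total = sum(counts.values()) + alpha * len(counts)
    return {cat: (c + alpha) / total for cat, c in counts.items()}


def unseen_category_probability(n_observed, concentration=1.0):
    """Probability that the next observation belongs to a new category.

    Uses the Chinese-restaurant-process rule for a Dirichlet Process with
    the given concentration parameter (value chosen for illustration).
    """
    return concentration / (n_observed + concentration)


if __name__ == "__main__":
    # Hypothetical sample: 3 observations of category "a", 1 of category "b".
    print(beta_posterior(3, 1))
    print(dirichlet_multinomial_predictive({"a": 3, "b": 1}))
    print(unseen_category_probability(4))
```

A larger sample shrinks both the Beta posterior variance and the probability of an unseen category, which is the sense in which these models quantify uncertainty rather than only estimating proportions.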

Similar resources

Uncertainty Estimation and Analysis of Categorical Web Data

Web data often manifest high levels of uncertainty. We focus on categorical Web data and we represent these uncertainty levels as first- or second-order uncertainty. By means of concrete examples, we show how to quantify and handle these uncertainties using the Beta-Binomial and the Dirichlet-Multinomial models, as well as how to take into account possibly unseen categories in our samples by using th...

A Comparative Study of Performance of Adaptive Web Sampling and General Inverse Adaptive Sampling in Estimating Olive Production in Iran

Nowadays, there is an increasing use of sampling methods in network and spatial populations. Although the most common link-tracing designs such as adaptive cluster sampling and snowball sampling have advantages over conventional sampling designs such as simple random sampling and cluster sampling, these designs still present many drawbacks. Adaptive web sampling is a new link-tracing design tha...

SSDR: An Algorithm for Clustering Categorical Data Using Rough Set Theory

At present, a large number of clustering algorithms are available to group objects having similar characteristics. But the implementations of many of those algorithms are challenging when dealing with categorical data. While some of the algorithms available at present cannot handle categorical data, others are unable to handle uncertainty. Many of them have the stabilit...

Categorical models for spatial data uncertainty

Considerable disparity exists between the current state of the art for categorical spatial data error modeling and the current state of the practice for reporting categorical data quality. On one hand, the general Monte Carlo simulation-based error propagation framework is a fixture in spatial data error handling; researchers have identified potentially powerful approaches to characterizing cat...

MMR: An algorithm for clustering categorical data using Rough Set Theory

A variety of cluster analysis techniques exist to group objects having similar characteristics. However, the implementation of many of these techniques is challenging due to the fact that much of the data contained in today’s databases is categorical in nature. While there have been recent advances in algorithms for clustering categorical data, some are unable to handle uncertainty in the clust...

Publication date: 2011